Critical difference diagram based on results from post-hoc Nemenyi tests:
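A critical difference diagram places each model at its average rank across datasets and joins models whose average ranks differ by less than the Nemenyi critical difference, CD = q_alpha * sqrt(k(k+1) / (6N)). The sketch below shows one way to compute both quantities. It is not the code that produced the diagram: the function names and toy scores are hypothetical, and the q_alpha values are the commonly tabulated ones for alpha = 0.05.

```python
from math import sqrt

# Commonly tabulated critical values q_alpha for the two-tailed Nemenyi test
# at alpha = 0.05, indexed by the number of compared models k.
Q_ALPHA_005 = {2: 1.960, 3: 2.343, 4: 2.569, 5: 2.728, 6: 2.850,
               7: 2.949, 8: 3.031, 9: 3.102, 10: 3.164}


def average_ranks(scores):
    """Rank models per dataset (1 = best, ties share the mean rank), then average.

    `scores` maps model name -> list of scores, one per dataset; higher is better.
    """
    names = list(scores)
    n_datasets = len(next(iter(scores.values())))
    totals = dict.fromkeys(names, 0.0)
    for d in range(n_datasets):
        order = sorted(names, key=lambda m: scores[m][d], reverse=True)
        i = 0
        while i < len(order):
            # Extend j to cover every model tied with the one at position i.
            j = i
            while j + 1 < len(order) and scores[order[j + 1]][d] == scores[order[i]][d]:
                j += 1
            mean_rank = ((i + 1) + (j + 1)) / 2
            for m in order[i:j + 1]:
                totals[m] += mean_rank
            i = j + 1
    return {m: totals[m] / n_datasets for m in names}


def critical_difference(k, n_datasets):
    """Nemenyi critical difference for k models compared on n_datasets datasets."""
    return Q_ALPHA_005[k] * sqrt(k * (k + 1) / (6.0 * n_datasets))


# Hypothetical scores for three models on four datasets (illustration only).
toy_scores = {
    "model_a": [0.77, 0.80, 0.75, 0.79],
    "model_b": [0.76, 0.78, 0.75, 0.74],
    "model_c": [0.70, 0.79, 0.72, 0.73],
}
ranks = average_ranks(toy_scores)
cd = critical_difference(k=len(toy_scores), n_datasets=4)
print(ranks, round(cd, 3))
```

Two models would be reported as significantly different only when their average ranks differ by at least `cd`.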
Output from Signed Rank Test
Output from test on multiple datasets:
OULAD data (red, 2013-2014) vs. 2015 data (blue)
Example Map of IMD
OULAD data (red, 2013-2014) vs. 2015 data (blue)
Showing 10 rows from "landing"."studentInfo"
| | code_module | code_presentation | id_student | gender | region | highest_education | imd_band | age_band | num_of_prev_attempts | studied_credits | disability | final_result |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | AAA | 2013J | 11391 | M | East Anglian Region | HE Qualification | 90-100% | 55<= | 0 | 240 | N | Pass |
| 1 | AAA | 2013J | 28400 | F | Scotland | HE Qualification | 20-30% | 35-55 | 0 | 60 | N | Pass |
| 2 | AAA | 2013J | 30268 | F | North Western Region | A Level or Equivalent | 30-40% | 35-55 | 0 | 60 | Y | Withdrawn |
| 3 | AAA | 2013J | 31604 | F | South East Region | A Level or Equivalent | 50-60% | 35-55 | 0 | 60 | N | Pass |
| 4 | AAA | 2013J | 32885 | F | West Midlands Region | Lower Than A Level | 50-60% | 0-35 | 0 | 60 | N | Pass |
| 5 | AAA | 2013J | 38053 | M | Wales | A Level or Equivalent | 80-90% | 35-55 | 0 | 60 | N | Pass |
| 6 | AAA | 2013J | 45462 | M | Scotland | HE Qualification | 30-40% | 0-35 | 0 | 60 | N | Pass |
| 7 | AAA | 2013J | 45642 | F | North Western Region | A Level or Equivalent | 90-100% | 0-35 | 0 | 120 | N | Pass |
| 8 | AAA | 2013J | 52130 | F | East Anglian Region | A Level or Equivalent | 70-80% | 0-35 | 0 | 90 | N | Pass |
| 9 | AAA | 2013J | 53025 | M | North Region | Post Graduate Qualification | None | 55<= | 0 | 60 | N | Pass |
Showing 10 rows from main.student_info
| | id | orig_student_id | course_id | module_id | presentation_id | age_band_id | imd_band_id | highest_education_id | region_id | final_result_id | is_female | has_disability | date_registration | date_unregistration | studied_credits | num_of_prev_attempts |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 32588 | 3733 | 10 | 4 | 2 | 3 | 10 | 2 | 9 | 4 | 0 | 0 | -68 | -8.0 | 60 | 0 |
| 1 | 22291 | 6516 | 2 | 1 | 4 | 3 | 9 | 2 | 7 | 3 | 0 | 0 | -52 | NaN | 60 | 0 |
| 2 | 32538 | 8462 | 9 | 4 | 4 | 3 | 4 | 2 | 4 | 4 | 0 | 0 | -38 | 18.0 | 60 | 1 |
| 3 | 32539 | 8462 | 10 | 4 | 2 | 3 | 4 | 2 | 4 | 4 | 0 | 0 | -137 | 119.0 | 90 | 0 |
| 4 | 22281 | 11391 | 1 | 1 | 2 | 3 | 10 | 2 | 1 | 3 | 0 | 0 | -159 | NaN | 240 | 0 |
| 5 | 4461 | 23629 | 4 | 2 | 1 | 1 | 3 | 3 | 1 | 2 | 1 | 0 | -47 | NaN | 60 | 2 |
| 6 | 26116 | 23632 | 3 | 2 | 2 | 1 | 5 | 1 | 1 | 4 | 1 | 0 | -194 | -51.0 | 60 | 0 |
| 7 | 14301 | 23698 | 7 | 3 | 4 | 1 | 6 | 1 | 1 | 3 | 1 | 0 | -110 | NaN | 120 | 0 |
| 8 | 994 | 23798 | 3 | 2 | 2 | 1 | 6 | 1 | 11 | 1 | 0 | 0 | -27 | NaN | 60 | 0 |
| 9 | 11595 | 24186 | 20 | 7 | 3 | 1 | 2 | 3 | 13 | 3 | 1 | 1 | -25 | NaN | 30 | 0 |
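The two previews suggest that text columns in the landing table were recoded into flags and lookup ids in main.student_info. The sketch below covers only the two flag columns and assumes that gender "F" maps to is_female = 1 and disability "Y" maps to has_disability = 1. That assumption is consistent with student 11391, who appears in both previews as M / N and 0 / 0, but the actual transformation code is not shown in the source.

```python
import pandas as pd

# Three rows copied from the landing preview (only the columns used here).
landing = pd.DataFrame({
    "id_student": [11391, 28400, 30268],
    "gender": ["M", "F", "F"],
    "disability": ["N", "N", "Y"],
    "studied_credits": [240, 60, 60],
})

# Assumed recoding: "F" -> is_female = 1, "Y" -> has_disability = 1.
student_info = pd.DataFrame({
    "orig_student_id": landing["id_student"],
    "is_female": (landing["gender"] == "F").astype(int),
    "has_disability": (landing["disability"] == "Y").astype(int),
    "studied_credits": landing["studied_credits"],
})
print(student_info)
```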
Example of GridSearch Cross-Validation for hxg_boost: clf__learning_rate [0.1], clf__random_state [None]; clf__learning_rate [0.01], clf__random_state [None]; clf__learning_rate [0.001], clf__random_state [None]
Example of RandomizedSearch Cross-Validation for dtree: clf__splitter ['best'], clf__random_state [None], clf__min_samples_split [61], clf__min_samples_leaf [8], clf__max_features ['sqrt'], clf__max_depth [16], clf__criterion ['log_loss']
Example of RandomizedSearch Cross-Validation for ada_boost: clf__learning_rate [0.009607840680411647], clf__random_state [None]
Example of RandomizedSearch Cross-Validation for hxg_boost: clf__interaction_cst ['no_interactions'], clf__l2_regularization [0.3040953370005851], clf__learning_rate [0.03922058903998265], clf__max_bins [195], clf__max_depth [45], clf__max_iter [74], clf__min_samples_leaf [8], clf__random_state [None], clf__warm_start [True]
Example of RandomizedSearch Cross-Validation for rforest: clf__bootstrap [True], clf__criterion ['entropy'], clf__max_features ['log2'], clf__max_samples [0.12272488337757193], clf__min_samples_leaf [5], clf__min_samples_split [9], clf__n_estimators [52], clf__n_jobs [-1], clf__oob_score [True], clf__random_state [None]
Example of RandomizedSearch Cross-Validation for etree: clf__bootstrap [True], clf__criterion ['gini'], clf__max_features ['sqrt'], clf__max_samples [0.1400021301694605], clf__min_samples_leaf [9], clf__min_samples_split [5], clf__n_estimators [103], clf__n_jobs [-1], clf__oob_score [True], clf__random_state [None]
Example of RandomizedSearch Cross-Validation for knn: clf__weights ['distance'], clf__p [1], clf__n_neighbors [4], clf__n_jobs [-1], clf__leaf_size [56], clf__algorithm ['ball_tree']
Example of RandomizedSearch Cross-Validation for logreg: clf__C [0.3234846473870656], clf__penalty ['l2'], clf__random_state [None], clf__solver ['liblinear']
Example of RandomizedSearch Cross-Validation for mlp: clf__activation ['logistic'], clf__alpha [0.0014749079430959996], clf__early_stopping [False], clf__hidden_layer_sizes [165], clf__learning_rate ['invscaling'], clf__learning_rate_init [0.008378338921501449], clf__max_iter [63], clf__power_t [0.02097768921985744], clf__random_state [None], clf__solver ['sgd']
Example of RandomizedSearch Cross-Validation for svc: clf__C [0.08503983103378839], clf__degree [2], clf__gamma ['scale'], clf__kernel ['rbf'], clf__probability [True], clf__random_state [None]
Example of RandomizedSearch Cross-Validation for compnb: clf__alpha [0.0032502548421112333], clf__norm [True]
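The `clf__` prefix in the parameter names above is how scikit-learn addresses the parameters of a pipeline step named `clf`. As a rough sketch of how such a randomized search could be set up for the logreg case, the snippet below uses synthetic data and an assumed sampling range for `C`. The scaler step, the data, and the range are illustrative assumptions, not the original configuration.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; the real features are not shown in this section.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# The final step is named "clf", so its parameters are addressed as "clf__<name>".
pipe = Pipeline([("scale", StandardScaler()), ("clf", LogisticRegression())])

# Assumed search space: C sampled log-uniformly, solver fixed to liblinear.
param_distributions = {
    "clf__C": loguniform(1e-3, 1e1),
    "clf__solver": ["liblinear"],
}

search = RandomizedSearchCV(
    pipe,
    param_distributions,
    n_iter=5,            # number of sampled candidates
    scoring="roc_auc",
    cv=3,
    random_state=0,      # makes the sampled candidates reproducible
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Passing a fixed `random_state` to the search makes the sampled candidates repeatable across runs.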
| | model_type | mean_fit_time | std_fit_time | mean_score_time | std_score_time | mean_test_roc_auc | std_test_roc_auc |
|---|---|---|---|---|---|---|---|
| 0 | rforest | 43.546772 | 6.621612 | 0.308331 | 0.079489 | 0.773876 | 0.006709 |
| 1 | etree | 22.712446 | 2.100217 | 0.221903 | 0.027837 | 0.771116 | 0.004962 |
| 2 | hxg_boost | 8.673935 | 0.686707 | 0.297604 | 0.038754 | 0.770126 | 0.004659 |
| 3 | mlp | 13.990430 | 0.143472 | 0.077275 | 0.011556 | 0.769375 | 0.007105 |
| 4 | hxg_boost | 6.469628 | 1.027840 | 0.263747 | 0.042775 | 0.769049 | 0.006249 |
| 5 | etree | 14.824491 | 3.315720 | 0.251914 | 0.041687 | 0.767965 | 0.004917 |
| 6 | hxg_boost | 1.569689 | 0.177629 | 0.086444 | 0.013947 | 0.767947 | 0.005761 |
| 7 | mlp | 1.304684 | 0.044562 | 0.029652 | 0.002250 | 0.767719 | 0.008325 |
| 8 | rforest | 21.928703 | 4.551270 | 0.763771 | 0.412899 | 0.767640 | 0.005182 |
| 9 | hxg_boost | 1.494005 | 0.194275 | 0.075681 | 0.012553 | 0.767497 | 0.004604 |
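The column names in this table match the keys scikit-learn exposes in `cv_results_` when a search is scored with a metric named `roc_auc`. A ranking like this could be assembled along the following lines. This is a sketch on synthetic data with two small, hypothetical grids, not the original search code.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data and tiny illustrative grids (assumptions).
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Passing scoring as a list yields keys such as "mean_test_roc_auc";
# refit=False because no single best estimator is needed for the summary.
searches = {
    "logreg": GridSearchCV(LogisticRegression(max_iter=1000), {"C": [0.1, 1.0]},
                           scoring=["roc_auc"], refit=False, cv=3),
    "dtree": GridSearchCV(DecisionTreeClassifier(random_state=0), {"max_depth": [2, 4]},
                          scoring=["roc_auc"], refit=False, cv=3),
}

columns = ["mean_fit_time", "std_fit_time", "mean_score_time",
           "std_score_time", "mean_test_roc_auc", "std_test_roc_auc"]

frames = []
for model_type, search in searches.items():
    search.fit(X, y)
    frame = pd.DataFrame(search.cv_results_)[columns]
    frame.insert(0, "model_type", model_type)  # label each candidate with its model
    frames.append(frame)

# One row per candidate, best mean ROC AUC first.
summary = (pd.concat(frames, ignore_index=True)
           .sort_values("mean_test_roc_auc", ascending=False)
           .reset_index(drop=True))
print(summary.head(10))
```

Because every sampled candidate gets its own row, the same `model_type` can appear several times, as hxg_boost does in the table above.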